Jimmy Lin

Benchmarking LLM Faithfulness in RAG with Evolving Leaderboards

May 07, 2025
Via arXiv

Tevatron 2.0: Unified Document Retrieval Toolkit across Scale, Language, and Modality

May 05, 2025
Via arXiv

Chatbot Arena Meets Nuggets: Towards Explanations and Diagnostics in the Evaluation of LLM Responses

Apr 28, 2025
Via arXiv

The Great Nugget Recall: Automating Fact Extraction and RAG Evaluation with Large Language Models

Apr 21, 2025
Via arXiv

Support Evaluation for the TREC 2024 RAG Track: Comparing Human versus LLM Judges

Apr 21, 2025
Via arXiv

FreshStack: Building Realistic Benchmarks for Evaluating Retrieval on Technical Documents

Apr 17, 2025
Via arXiv

Beyond Quacking: Deep Integration of Language Models and RAG into DuckDB

Apr 01, 2025
Via arXiv

Rank-R1: Enhancing Reasoning in LLM-based Document Rerankers via Reinforcement Learning

Mar 08, 2025
Via arXiv

Teaching Dense Retrieval Models to Specialize with Listwise Distillation and LLM Data Augmentation

Feb 27, 2025
Via arXiv

DRAMA: Diverse Augmentation from Large Language Models to Smaller Dense Retrievers

Feb 25, 2025
Via arXiv